Social Agents Playing a Periodical Policy
Internal identifier: 002462 (Main/Exploration); previous: 002461; next: 002463
Authors: Ann Nowé [Belgium]; Johan Parent [Belgium]; Katja Verbeeck [Belgium]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2001.
Abstract
Coordination is an important issue in multiagent systems. Within the stochastic game framework, this problem translates to policy learning in a joint action space. This technique, however, suffers from some important drawbacks, such as the assumption of the existence of a unique Nash equilibrium and synchronicity, the need for central control, and the cost of communication. Moreover, in general-sum games it is not always clear which policies should be learned. Playing a pure Nash equilibrium is often unfair to at least one of the players, while playing a mixed strategy gives no guarantee of coordination and usually results in a sub-optimal payoff for all agents. In this work we show the usefulness of periodical policies, which arise as a side effect of the fairness conditions used by the agents. We are interested in games which assume competition between the players, but where the overall performance can only be as good as the performance of the poorest player. Players are social distributed reinforcement learners, who have to learn to equalize their payoffs. Our approach is illustrated on synchronous one-step games as well as on asynchronous job scheduling games.
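The abstract's contrast between pure, mixed, and periodical play can be made concrete with a small payoff calculation. The sketch below is only an illustration under stated assumptions: the 2x2 payoff table (a standard Battle of the Sexes game) and the helper names `average_payoffs` and `mixed_payoffs` are hypothetical and do not come from the paper, and the code does not implement the authors' learning algorithm. It only compares the payoffs that each kind of policy would yield.

```python
from fractions import Fraction

# Hypothetical 2x2 general-sum game (Battle of the Sexes); not taken from the paper.
# Keys are joint actions (player 1, player 2); values are (payoff 1, payoff 2).
PAYOFFS = {
    ("A", "A"): (2, 1),
    ("B", "B"): (1, 2),
    ("A", "B"): (0, 0),
    ("B", "A"): (0, 0),
}


def average_payoffs(joint_actions):
    """Average per-player payoff over one period of a cyclic joint-action sequence."""
    n = len(joint_actions)
    totals = [sum(PAYOFFS[ja][i] for ja in joint_actions) for i in range(2)]
    return tuple(Fraction(t, n) for t in totals)


def mixed_payoffs(p_a1, p_a2):
    """Expected payoffs when player i independently plays A with probability p_ai."""
    probs = {
        ("A", "A"): p_a1 * p_a2,
        ("A", "B"): p_a1 * (1 - p_a2),
        ("B", "A"): (1 - p_a1) * p_a2,
        ("B", "B"): (1 - p_a1) * (1 - p_a2),
    }
    return tuple(sum(probs[ja] * PAYOFFS[ja][i] for ja in PAYOFFS) for i in range(2))


# Always playing one pure Nash equilibrium favours player 1.
pure = average_payoffs([("A", "A")])
# Alternating between the two pure equilibria (a period-2 policy) equalizes payoffs.
periodic = average_payoffs([("A", "A"), ("B", "B")])
# Mixed Nash equilibrium of this game: player 1 plays A w.p. 2/3, player 2 w.p. 1/3.
mixed = mixed_payoffs(Fraction(2, 3), Fraction(1, 3))

print("pure:", pure, "periodic:", periodic, "mixed:", mixed)
```

In this toy game the period-2 policy gives both players the same average payoff, and the poorest player does better than under either the pure or the mixed equilibrium, which is the performance measure the abstract describes.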
DOI: 10.1007/3-540-44795-4_33
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 001674
- to stream Istex, to step Curation: 001319
- to stream Istex, to step Checkpoint: 001C99
- to stream Main, to step Merge: 002515
- to stream Main, to step Curation: 002462
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct:series"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Social Agents Playing a Periodical Policy</title>
<author><name sortKey="Nowe, Ann" sort="Nowe, Ann" uniqKey="Nowe A" first="Ann" last="Nowé">Ann Nowé</name>
</author>
<author><name sortKey="Parent, Johan" sort="Parent, Johan" uniqKey="Parent J" first="Johan" last="Parent">Johan Parent</name>
</author>
<author><name sortKey="Verbeeck, Katja" sort="Verbeeck, Katja" uniqKey="Verbeeck K" first="Katja" last="Verbeeck">Katja Verbeeck</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:B9569E9E27EC810019E7B082F8F573E317E9C04A</idno>
<date when="2001" year="2001">2001</date>
<idno type="doi">10.1007/3-540-44795-4_33</idno>
<idno type="url">https://api.istex.fr/document/B9569E9E27EC810019E7B082F8F573E317E9C04A/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001674</idno>
<idno type="wicri:Area/Istex/Curation">001319</idno>
<idno type="wicri:Area/Istex/Checkpoint">001C99</idno>
<idno type="wicri:doubleKey">0302-9743:2001:Nowe A:social:agents:playing</idno>
<idno type="wicri:Area/Main/Merge">002515</idno>
<idno type="wicri:Area/Main/Curation">002462</idno>
<idno type="wicri:Area/Main/Exploration">002462</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Social Agents Playing a Periodical Policy</title>
<author><name sortKey="Nowe, Ann" sort="Nowe, Ann" uniqKey="Nowe A" first="Ann" last="Nowé">Ann Nowé</name>
<affiliation wicri:level="1"><country xml:lang="fr">Belgique</country>
<wicri:regionArea>Computational Modeling Lab</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><country xml:lang="fr">Belgique</country>
<wicri:regionArea>Vrije Universiteit Brussel</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Belgique</country>
</affiliation>
</author>
<author><name sortKey="Parent, Johan" sort="Parent, Johan" uniqKey="Parent J" first="Johan" last="Parent">Johan Parent</name>
<affiliation wicri:level="1"><country xml:lang="fr">Belgique</country>
<wicri:regionArea>Computational Modeling Lab</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><country xml:lang="fr">Belgique</country>
<wicri:regionArea>Vrije Universiteit Brussel</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Belgique</country>
</affiliation>
</author>
<author><name sortKey="Verbeeck, Katja" sort="Verbeeck, Katja" uniqKey="Verbeeck K" first="Katja" last="Verbeeck">Katja Verbeeck</name>
<affiliation wicri:level="1"><country xml:lang="fr">Belgique</country>
<wicri:regionArea>Computational Modeling Lab</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><country xml:lang="fr">Belgique</country>
<wicri:regionArea>Vrije Universiteit Brussel</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Belgique</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2001</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">B9569E9E27EC810019E7B082F8F573E317E9C04A</idno>
<idno type="DOI">10.1007/3-540-44795-4_33</idno>
<idno type="ChapterID">Chap33</idno>
<idno type="ChapterID">33</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Coordination is an important issue in multiagent systems. Within the stochastic game framework this problem translates to policy learning in a joint action space. This technique however suffers some important drawbacks like the assumption of the existence of a unique Nash equilibrium and synchronicity, the need for central control, the cost of communication, etc. Moreover in general sum games it is not always clear which policies should be learned. Playing pure Nash equilibrium is often unfair to at least one of the players, while playing a mixed strategy doesn’t give any guarantee for coordination and usually results in a sub-optimal payoff for all agents. In this work we show the usefulness of periodical policies, which arise as a side effect of the fairness conditions used by the agents. We are interested in games which assume competition between the players, but where the overall performance can only be as good as the performance of the poorest player. Players are social distributed reinforcement learners, who have to learn to equalize their payoff. Our approach is illustrated on synchronous one-step games as well as on asynchronous job scheduling games.</div>
</front>
</TEI>
<affiliations><list><country><li>Belgique</li>
</country>
</list>
<tree><country name="Belgique"><noRegion><name sortKey="Nowe, Ann" sort="Nowe, Ann" uniqKey="Nowe A" first="Ann" last="Nowé">Ann Nowé</name>
</noRegion>
<name sortKey="Nowe, Ann" sort="Nowe, Ann" uniqKey="Nowe A" first="Ann" last="Nowé">Ann Nowé</name>
<name sortKey="Nowe, Ann" sort="Nowe, Ann" uniqKey="Nowe A" first="Ann" last="Nowé">Ann Nowé</name>
<name sortKey="Parent, Johan" sort="Parent, Johan" uniqKey="Parent J" first="Johan" last="Parent">Johan Parent</name>
<name sortKey="Parent, Johan" sort="Parent, Johan" uniqKey="Parent J" first="Johan" last="Parent">Johan Parent</name>
<name sortKey="Parent, Johan" sort="Parent, Johan" uniqKey="Parent J" first="Johan" last="Parent">Johan Parent</name>
<name sortKey="Verbeeck, Katja" sort="Verbeeck, Katja" uniqKey="Verbeeck K" first="Katja" last="Verbeeck">Katja Verbeeck</name>
<name sortKey="Verbeeck, Katja" sort="Verbeeck, Katja" uniqKey="Verbeeck K" first="Katja" last="Verbeeck">Katja Verbeeck</name>
<name sortKey="Verbeeck, Katja" sort="Verbeeck, Katja" uniqKey="Verbeeck K" first="Katja" last="Verbeeck">Katja Verbeeck</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Musique/explor/MozartV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002462 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002462 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Musique |area= MozartV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:B9569E9E27EC810019E7B082F8F573E317E9C04A |texte= Social Agents Playing a Periodical Policy }}
This area was generated with Dilib version V0.6.20.